Support vector machines: hype or hallelujah?
Kristin P. Bennett, Colin Campbell 

December 2000 ACM SIGKDD Explorations Newsletter, Volume 2 Issue 2 
Publisher: ACM Press

Full text available: pdf(1.26 MB) Additional Information: full citation, citings, index terms



Keywords: Support Vector Machines, kernel methods, statistical learning theory 



Research track paper: Rule extraction from linear support vector machines 
Glenn Fung, Sathyakama Sandilya, R. Bharat Rao 

August 2005 Proceeding of the eleventh ACM SIGKDD international conference on 
Knowledge discovery in data mining KDD '05 

Publisher: ACM Press 

Full text available: pdf(244.15 KB) Additional Information: full citation, abstract, references, index terms

We describe an algorithm for converting linear support vector machines and any other 
arbitrary hyperplane-based linear classifiers into a set of non-overlapping rules that, 
unlike the original classifier, can be easily interpreted by humans. Each iteration of the 
rule extraction algorithm is formulated as a constrained optimization problem that is 
computationally inexpensive to solve. We discuss various properties of the algorithm and 
provide proof of convergence for two different optimization c ... 



Keywords: linear classifiers, mathematical programming, medical decision-support, rule 
extraction 



Core Vector Machines: Fast SVM Training on Very Large Data Sets

Ivor W. Tsang, James T. Kwok, Pak-Ming Cheung 

April 2005 The Journal of Machine Learning Research, Volume 6 

Publisher: MIT Press 

Full text available: pdf Additional Information: full citation, abstract

Standard SVM training has O(m^3) time and O(m^2) space complexities, where m is the
training set size. It is thus computationally infeasible on very large data sets. By observing 
that practical SVM implementations only approximate the optimal solution by an iterative 
strategy, we scale up kernel methods by exploiting such "approximateness" in this paper. 
We first show that many kernel methods can be equivalently formulated as minimum ... 

A Modified Finite Newton Method for Fast Solution of Large Scale Linear SVMs
S. Sathiya Keerthi, Dennis DeCoste 

April 2005 The Journal of Machine Learning Research, Volume 6 
Publisher: MIT Press

Full text available: pdf Additional Information: full citation, abstract
This paper develops a fast method for solving linear SVMs with L2 loss function that is
suited for large scale data mining tasks such as text classification. This is done by
modifying the finite Newton method of Mangasarian in several ways. Experiments indicate
that the method is much faster than decomposition methods such as SVMlight, SMO and
BSVM (e.g., 4-100 fold), especially when the number of examples is large. The paper also 
suggests ways of extending the metho ... 

Robust feature induction for support vector machines 
Rong Jin, Huan Liu 

July 2004 Proceedings of the twenty-first international conference on Machine 
learning ICML '04 

Publisher: ACM Press

Full text available: pdf(233.30 KB) Additional Information: full citation, abstract, references

The goal of feature induction is to automatically create nonlinear combinations of existing 
features as additional input features to improve classification accuracy. Typically, 
nonlinear features are introduced into a support vector machine (SVM) through a 
nonlinear kernel function. One disadvantage of such an approach is that the feature space 
induced by a kernel function is usually of high dimension and therefore will substantially 
increase the chance of over-fitting the training data. Another ... 

Sparse bayesian learning and the relevance vector machine
Michael E. Tipping 

September 2001 The Journal of Machine Learning Research, Volume 1
Publisher: MIT Press 

Full text available: pdf(999.88 KB) Additional Information: full citation, abstract, citings

This paper introduces a general Bayesian framework for obtaining sparse solutions to 
regression and classification tasks utilising models linear in the parameters. Although this 
framework is fully general, we illustrate our approach with a particular specialisation that 
we denote the 'relevance vector machine' (RVM), a model of identical functional form to 
the popular and state-of-the-art 'support vector machine' (SVM). We demonstrate that by 
exploiting a probabilistic Bayesian learning framewor ... 

A fast iterative algorithm for fisher discriminant using heterogeneous kernels
Glenn Fung, Murat Dundar, Jinbo Bi, Bharat Rao 

July 2004 Proceedings of the twenty-first international conference on Machine 
learning ICML '04 

Publisher: ACM Press 

Full text available: pdf(217.86 KB) Additional Information: full citation, abstract, references

We propose a fast iterative classification algorithm for Kernel Fisher Discriminant (KFD) 
using heterogeneous kernel models. In contrast with the standard KFD that requires the 
user to predefine a kernel function, we incorporate the task of choosing an appropriate 
kernel into the optimization problem to be solved. The choice of kernel is defined as a 
linear combination of kernels belonging to a potentially large family of different positive 
semidefinite kernels. The complexity of our algorithm d ... 

Keywords: Binary Classification, Heterogeneous Kernels, Linear Fisher Discriminant, 
Mathematical Programming 



Machine learning in automated text categorization 
Fabrizio Sebastiani 

March 2002 ACM Computing Surveys (CSUR), Volume 34 Issue 1 
Publisher: ACM Press

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

The automated categorization (or classification) of texts into predefined categories has 
witnessed a booming interest in the last 10 years, due to the increased availability of 
documents in digital form and the ensuing need to organize them. In the research 
community the dominant approach to this problem is based on machine learning 
techniques: a general inductive process automatically builds a classifier by learning, from 
a set of preclassified documents, the characteristics of the categories. ... 

Keywords: Machine learning, text categorization, text classification 
9 Automatic generation of functional vectors using the extended finite state machine
model 

Kwang-Ting Cheng, A. S. Krishnakumar 

January 1996 ACM Transactions on Design Automation of Electronic Systems (TODAES),
Volume 1 Issue 1
Publisher: ACM Press

Full text available: pdf(455.35 KB) Additional Information: full citation, abstract, references, citings, index terms, review

We present a method of automatic generation of functional vectors for sequential circuits. 
These vectors can be used for design verification, manufacturing testing, or power 
estimation. A high-level description of the circuit in VHDL or C is assumed available. Our 
method automatically transforms the high-level description of a circuit in VHDL or C into 
an extended finite state machine (EFSM) model that is used to generate functional 
vectors. The EFSM model is a generalization of the traditi ... 

Keywords: automatic test generation, design verification, extended finite state machines, 
functional testing 



10 TNPACK—A truncated Newton minimization package for large-scale problems: I.
Algorithm and usage
Tamar Schlick, Aaron Fogelson 

March 1992 ACM Transactions on Mathematical Software (TOMS), Volume 18 Issue 1 
Publisher: ACM Press

Full text available: pdf(1.54 MB) Additional Information: full citation, references, citings, index terms




Keywords: nonlinear optimization, preconditioned conjugate gradient, sparse matrices, 
truncated Newton methods 



11 Algorithm 711: BTN: software for parallel unconstrained optimization
Stephen G. Nash, Ariela Sofer 

December 1992 ACM Transactions on Mathematical Software (TOMS), Volume 18 Issue 4 
Publisher: ACM Press 

Full text available: pdf(1.04 MB) Additional Information: full citation, abstract, references, citings, index terms

BTN is a collection of FORTRAN subroutines for solving unconstrained nonlinear 
optimization problems. It currently runs on both Intel hypercube computers (distributed 
memory) and Sequent computers (shared memory), and can take advantage of vector 
processors if they are available. The software can also be run on traditional computers to 
simulate the performance of a parallel computer. BTN is a general-purpose algorithm, 
capable of solving problems with a large number of variables and suitab ...

Keywords: conjugate gradient method, nonlinear optimization, parallel computing, 
truncated-Newton method 





12 Industry/government track posters: Learning a complex metabolomic dataset using
random forests and support vector machines 
Young Truong, Xiaodong Lin, Chris Beecher 

August 2004 Proceedings of the tenth ACM SIGKDD international conference on 
Knowledge discovery and data mining KDD '04 

Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, index terms

Metabolomics is the "omics" science of biochemistry. The associated data include the 
quantitative measurements of all small molecule metabolites in a biological sample. These 
datasets provide a window into dynamic biochemical networks and conjointly with other 
"omic" data, genes and proteins, have great potential to unravel complex human diseases. 
The dataset used in this study has 63 individuals, normal and diseased, and the diseased 
are drug treated or not, so there are three classes. The goal ... 

Keywords: metabolomics, missing data, random forest, support vector machines 





http://portal.acm.org/results.cfm?coll=ACM&dl=ACM&CFID=56223544& 10/22/2005 



Results (page 1): support vector machine and newton algorithm 






13 Pac-bayesian generalisation error bounds for gaussian process classification 
Matthias Seeger 

March 2003 The Journal of Machine Learning Research, Volume 3
Publisher: MIT Press

Full text available: pdf Additional Information: full citation, abstract, references, index terms

Approximate Bayesian Gaussian process (GP) classification techniques are powerful non- 
parametric learning methods, similar in appearance and performance to support vector 
machines. Based on simple probabilistic models, they render interpretable results and can 
be embedded in Bayesian frameworks for model selection, feature selection, etc. In this 
paper, by applying the PAC-Bayesian theorem of McAllester (1999a), we prove 
distribution-free generalisation error bounds for a wide range of approxima ... 

Keywords: Bayesian learning, Gaussian processes, Gibbs classifier, Kernel machines, 
PAC-Bayesian framework, convex duality, generalisation error bounds, sparse 
approximations 



14 Area and performance tradeoffs in floating-point divide and square-root 
implementations
Peter Soderquist, Miriam Leeser 

September 1996 ACM Computing Surveys (CSUR), Volume 28 Issue 3 
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

Floating-point divide and square-root operations are essential to many scientific and 
engineering applications, and are required in all computer systems that support the IEEE 
floating-point standard. Yet many current microprocessors provide only weak support for 
these operations. The latency and throughput of division are typically far inferior to those 
of floating-point addition and multiplication, and square-root performance is often even 
lower. This article argues the case for high-perf ... 

Keywords: FPU, SRT, area and performance tradeoffs, division, floating-point, square 
root 
Technical reports
SIGACT News Staff 

January 1980 ACM SIGACT News, Volume 12 Issue 1 
Publisher: ACM Press

Full text available: pdf(5.28 MB) Additional Information: full citation



16 Pen computing: a technology overview and a vision
Andre Meyer 

July 1995 ACM SIGCHI Bulletin, Volume 27 Issue 3 
Publisher: ACM Press

Full text available: pdf Additional Information: full citation, abstract, citings, index terms

This work gives an overview of a new technology that is attracting growing interest in 
public as well as in the computer industry itself. The visible difference from other 
technologies is in the use of a pen or pencil as the primary means of interaction between a 
user and a machine, picking up the familiar pen and paper interface metaphor. From this 
follows a set of consequences that will be analyzed and put into context with other 
emerging technologies and visions. Starting with a short historic ...
'C and tcc: a language and compiler for dynamic code generation

Massimiliano Poletto, Wilson C. Hsieh, Dawson R. Engler, M. Frans Kaashoek 

March 1999 ACM Transactions on Programming Languages and Systems (TOPLAS),
Volume 21 Issue 2
Publisher: ACM Press

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms, review

Dynamic code generation allows programmers to use run-time information in order to 
achieve performance and expressiveness superior to those of static code. The 'C (Tick C)
language is a superset of ANSI C that supports efficient and high-level use of dynamic
code generation. 'C provides dynamic code generation at the level of C expressions and 
statements and supports the composition of dynamic code at run time. These features 
enable programmers to add dynamic code generation ... 

Keywords: ANSI C, compilers, dynamic code generation, dynamic code optimization 



18 Performance predictions for parallel diagonal-implicitly iterated Runge-Kutta methods
Thomas Rauber, Gudula Runger

July 1995 ACM SIGSIM Simulation Digest, Proceedings of the ninth workshop on
Parallel and distributed simulation, Volume 25 Issue 1
Publisher: IEEE Computer Society, ACM Press

Full text available: pdf(967.00 KB), Publisher Site Additional Information: full citation, abstract, references, index terms

Many simulations in the natural sciences and engineering require the numerical solution of 
nonlinear differential equations. For this class of numerical methods, we propose an 
appropriate parallel computation model on distributed memory machines that supports the 
prediction of execution times. As a case study, we investigate the parallel implementation 
of the diagonal-implicitly iterated Runge-Kutta method, a solution method for stiff systems 
of ordinary differential equations. An implement ... 



Keywords: Intel iPSC/860, Runge-Kutta methods, digital simulation, distributed memory 
machines, nonlinear differential equations, parallel algorithms, parallel computation 
model, parallel diagonal-implicitly iterated Runge-Kutta methods, performance evaluation, 
prediction model, simulations 



A coordination language for mixed task and data parallel programs
Thomas Rauber, Gudula Runger 

February 1999 Proceedings of the 1999 ACM symposium on Applied computing 

Publisher: ACM Press

Full text available: pdf(1.39 MB) Additional Information: full citation, references, citings, index terms



Keywords: coordination language, message-passing programs, mixed task and data 
parallelism, parallel scientific computing 
The connection machines CM-1 and CM-2: solving nonlinear network problems
S. A. Zenios, R. A. Lasken 

June 1988 Proceedings of the 2nd international conference on Supercomputing 

Publisher: ACM Press

Full text available: pdf(1.09 MB) Additional Information: full citation, abstract, references, citings, index terms

Massively parallel computers — like the Connection Machines CM-1 and CM-2 — have 
demonstrated remarkable performance in applications like pattern recognition, database 
searching, dense linear algebra computations and so on. In this paper we discuss the use 
of these systems for the solution of nonlinear network optimization problems that appear 
in operations research, transportation, engineering design, financial modeling and other
areas. We describe the implementation ... 